Network analysis of plasma proteomes in affective disorders

The conventional differentiation of affective disorders into major depressive disorder (MDD) and bipolar disorder (BD) has insufficient biological evidence. Utilizing multiple proteins quantified in plasma may provide critical insight into these limitations. In this study, the plasma proteomes of 299 patients with MDD or BD (aged 19–65 years old) were quantified using multiple reaction monitoring. Based on 420 protein expression levels, a weighted correlation network analysis was performed. Significant clinical traits with protein modules were determined using correlation analysis. Top hub proteins were determined using intermodular connectivity, and significant functional pathways were identified. Weighted correlation network analysis revealed six protein modules. The eigenprotein of a protein module with 68 proteins, including complement components as hub proteins, was associated with the total Childhood Trauma Questionnaire score (r = −0.15, p = 0.009). Another eigenprotein of a protein module of 100 proteins, including apolipoproteins as hub proteins, was associated with the overeating item of the Symptom Checklist-90-Revised (r = 0.16, p = 0.006). Functional analysis revealed immune responses and lipid metabolism as significant pathways for each module, respectively. No significant protein module was associated with the differentiation between MDD and BD. In conclusion, childhood trauma and overeating symptoms were significantly associated with plasma protein networks and should be considered important endophenotypes in affective disorders.


INTRODUCTION
The conventional differentiation of affective disorders into major depressive disorder (MDD) and bipolar disorder (BD) is based on the history of (hypo)manic symptoms [1]. As treatment regimens and outcomes differ between these disorders, there has been considerable effort to differentiate these disorders, including the use of biological correlates [2,3]. Top-down biological approaches have expanded our knowledge to facilitate the differentiation of these disorders. However, there are limitations with regard to inconsistency and modest accuracy [4,5]. Understanding affective disorders based on biological correlates with a transdiagnostic bottom-up approach may explain these limitations and deepen our knowledge of the pathophysiology of these disorders.
Proteomics-based research has received growing interest as proteomes reflect biological functions [6]. Recent technological advances have enabled researchers to simultaneously quantify multiple proteins [7]. While previous studies relied on few markers, multiplexing now permits the construction of networks between multiple proteins [8]. These approaches have focused on comparing specific diseases with healthy controls. For instance, a study from the NESDA (Netherlands Study of Depression and Anxiety) constructed networks with 171 blood proteomes to explain the differentiation between MDD and healthy controls [9]. However, to our knowledge, no study to date has applied this approach transdiagnostically in individuals with affective disorders.
In this study, we implemented weighted correlation network analysis to identify biologically meaningful modules of interconnected proteins in plasma samples from individuals with affective disorders, including both MDD and BD. Further analysis was performed to determine meaningful traits associated with these modules and to identify hub proteins from these modules.

MATERIALS AND METHODS Clinical samples
The initial study population comprised 169 patients with MDD and 141 patients with BD from our previous study [10]. In total, 26 patients with BD with a Young Mania Rating Scale (YMRS) total score >12 had been excluded to rule out those with current (hypo)manic symptoms, and 8 patients additionally had been excluded due to missing data of covariates. Patients were enrolled between August 2018 and December 2020 from 6 hospitals, including Seoul National University Hospital (SNUH); Nowon Eulji Medical Center, Eulji University; Seoul Metropolitan Government Seoul National University Boramae Medical Center; Hanyang University Hospital; Inha University Hospital; and Cha University Bundang Medical Center. The diagnosis was based on the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition  and was confirmed with the Mini-International Neuropsychiatric Interview (MINI). Only patients with Clinical Global Index-Severity (CGI-S) ≥ 3 were included.
Patients were excluded if they had taken any anti-inflammatory analgesic within the 2 preceding weeks; had a history of neurosurgery, central nervous system (CNS) diseases, cancer, and tuberculosis; had a substance use disorder other than alcohol, caffeine, and nicotine; were currently lactating or pregnant; and/or were predicted to have an intellectual disability or difficulty in understanding Korean. These exclusions were predominantly based on previous evidence of known associations between these conditions and protein expression [11][12][13][14][15][16][17][18][19]. Patients with a history of neuromodulation or intensive psychotherapy for the past 2 months were also excluded to confine the effect of treatment to psychotropic medications.
Plasma samples from each individual were collected in a 6-mL ethylenediaminetetraacetic acid (EDTA) tube (ref. 367863, Becton, Dickinson and Company, Trenton, NJ) and centrifuged at 1100-1300g for 10-15 min at 4°C or room temperature. The collected supernatant was stored in Eppendorf tubes at ≤−70°C until usage.
The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008. Informed consent for each participant was obtained. The study was reviewed by the Institutional Review Boards of Seoul National University Hospital (IRB No. 1806-1065-951) and all other participating hospitals.

Demographics and clinical features
The demographics considered in this study were age, sex, body mass index (BMI), blood collection time, fasting time, current alcohol use, exercise status, and smoking status [10]. Both age and BMI were analyzed as continuous variables. Sex (male/female), blood collection time (AM, PM), fasting time (<8 h, ≥8 h), current alcohol use, exercise status, and smoking status (yes/no) were analyzed as dichotomous variables. Alcohol use was defined as at least one drink once per week. Exercise status was classified based on the World Health Organization (WHO) recommendations for moderate-intensity physical activity for at least 30 min once per week [20].
Medication usage was classified as a dichotomous variable for antipsychotics, lithium/anticonvulsants, antidepressants, and benzodiazepines/hypnotics. Disease chronicity was assessed with the duration from first onset and duration from first medication (years).

Plasma proteomic quantification
Specific methods for the following analysis have been described in our previous study [10]. In brief, for 44 μL of each plasma sample, the six highest abundant proteins were depleted with the MARS-6 column (Agilent Technologies, Santa Clara, CA, USA). In total, 100 μg of quantified protein was reduced with 40 μL of 0.2% RapiGest solution and 20 mM dithiothreitol (DTT) at 60°C for 1 h and then alkylated with 20 μL of 100 mM iodoactamide (IAA) in the dark at room temperature for 30 min. Then, the samples were digested with trypsin solution at 37°C for 4 h, and the digestion was completed by adding 10% formic acid. The sample was centrifuged at 15,000 rpm for 1 h at 4°C, and the supernatants were spiked with crude stable isotope-labeled internal standard (SIS) peptides, in which a C-terminal lysine or arginine was heavy isotope-labeled ( 13 C 6 15 N 2 or 13 C 6 15 N 4 ) (purity: crude (>70%), JPT, Berlin, Germany). There were 5 preparation batches, and for each batch, the samples were randomly distributed and assigned identification numbers to blind the researchers throughout the sample preparation. From reviewing databases of psychiatric disorders, established targets for affective disorders, and laboratory-established targets, and after checking for detectability and quantifiability, 642 target peptides were initially selected [10].
Liquid chromatography-multiple reaction monitoring-mass spectrometry (LC-MRM-MS) was performed with a 1260 Infinity HPLC system equipped with a Jetstream electrospray source coupled to an Agilent 6490 triple quadrupole MS (Agilent Technologies, Santa Clara, CA, USA). The sample vials of the autosampler were maintained at 4°C, and the guard/ analytical columns were maintained at 40°C. For each digested sample, 40 μL was injected into a guard column (2.1 × 15.0 mm, 1.8 µm, 80 Å) (Agilent Technologies, Santa Clara, CA, USA). Online desalting was conducted in 3% solvent B (formic acid/acetonitrile (v/v)) at 50 μL/min for 10 min. After the valve position was switched, the sample was transferred to an analytical column (0.5 × 35.0 mm, 3.5 µm, 80 Å) (Agilent Technologies, Santa Clara, CA, USA) in 3% solvent B at 40 µL/min for 5 min. Bound peptides were separated on the column and eluted with a linear gradient of 3% to 35% solvent B at 40 µL/min for 50 min.
Mass spectra were generated in positive ion mode, with the following parameters: 2500 V ion spray capillary voltage, 2000 V nozzle voltage, 5 V cell accelerator voltage, 200 V delta EMV, and 380 V fragmented voltage. The drying gas was sprayed at 15 L/min at 250°C, and the sheath gas flow was 12 L/min at 350°C. The collision energy was optimized by adding the intensities of individual transitions that resulted in the largest peak area. SIS peptides were first pooled and analyzed to evaluate their retention times. The retention times were compared with those of endogenous peptides by analyzing the matrix of endogenous peptides with SIS peptides and a heavy β-galactosidase peptide. Subsequently, the final targets in individual blood samples were quantified. LC-MRM-MS analysis was performed once per sample (1 replicate for each sample).
Raw data from the LC-MRM-MS analysis was processed in Skyline (version 19.1.0) (MacCoss Lab, Seattle, WA, USA). Peptide quantification was calculated with the peak area ratio (PAR), defined as the ratio of endogenous to SIS peptide peak area. From 642 target peptides, 54 unstable peptides with low intensities (intensity < 1000), unequal retention times between light and heavy peptides, and skewed peaks were excluded. Subsequently, the final PAR values of 588 target peptides were normalized to the area of heavy β-galactosidase peptide.

Weighted correlation network analysis
An overview of the analysis is presented in Supplementary Fig. 1. Peptides with PAR values ≤ 0.01 or ≥100 for at least 5% of the final study population were initially excluded, in accordance with our previous study [10]. For proteins with multiple peptides quantified, the mean value was used for further analysis, resulting in a total of 420 proteins. Weighted correlation network analysis was conducted with the WGCNA package in R [31]. Initially, 8 samples were excluded as outliers by hierarchical clustering. Logarithmic transformation followed by batch correction was performed with the Combat algorithm for preparation batches presuming effects of age, sex, and BMI. Linear regression was performed to regress out the effects of age, sex, and BMI, and the residuals were used for further analysis. Three other samples were excluded as outliers by hierarchical clustering after preprocessing. A signed, weighted network was constructed [8] based on Pearson's correlation and was converted with the smallest soft thresholding power to generate a scale-free network (scalefree R 2 ≥ 0.9). After calculating a topological overlap measure (TOM) based on the adjacency matrix, a TOM dissimilarity matrix (1-TOM) was used to perform hierarchical clustering and map a dendrogram. Proteins were divided into different modules using the dynamic tree cut algorithm with the following parameters: deepSplit = 4, minimal module size = 20. Each module was summarized by the first principal component of protein expression across individuals, referred to as the module eigenprotein. Modules were merged if the difference between their module eigenprotein profiles was <0.25. The correlations between modules (represented with the module eigenprotein) and demographic/clinical traits were analyzed with Pearson correlation analysis. The association of modules with hospital type was assessed using ANOVA. Additional analysis to control for significant demographic traits when comparing the association between modules and clinical traits was performed using linear regression.
Module membership was defined as the correlation between protein expression profiles and the module eigenprotein [8]. For each protein, a protein significance measure was estimated as the absolute value of correlation between the expression profile and specific traits to identify proteins strongly associated with the traits [8]. The correlation between module membership and protein significance was analyzed to examine the presence of a linear relationship. Additionally, the network results were imported to Cytoscape software (Version 3.9.1) to select the top 10 hub proteins (intramodular-connected). This was performed based on S.J. Rhee et al.
Proteins from significant modules were entered in the DAVID bioinformatics resource tool (https://david.ncifcrf.gov/) for gene ontology analysis with the official gene names [34]. The gene ontology classification for biological processes, cellular components, and molecular functions was conducted using Fisher's exact test, with a cut-off value of 0.05. Additionally, pathway analysis was performed using the KEGG (Kyoto Encyclopedia of Genes and Genomes) database (http://www.genome.jp/ kegg/) [35].
Finally, network stability was assessed by iterating network reconstruction using the same settings while only including 63% of randomly sampled patients from the final study population, which was repeated 100 times with the sampledBlockwiseModules function. For each protein, consistency was calculated as the percentage of the 100 sub-samplings in which the protein was assigned to the same module. The stability of each module was defined as the average consistency of all proteins in the given module.

Statistical analysis
Statistical analysis of demographic/clinical characteristics and targeted proteomic data was performed with SPSS version 21.0 and R version 4.1.2. Statistical tests were two-tailed. P-values < 0.05 were considered statistically significant. For functional analysis, the Benjamini-FDR corrected q-value was calculated.

Demographics and clinical characteristics of the final population
The demographic and clinical characteristics of the study population after excluding outliers are presented in Table 1 Weighted correlation network analysis A signed weighted network was constructed based on Pearson's correlation with soft thresholding power = 11 for a scale-free network (scale-free R 2 ≥ 0.9) (Fig. 1a). In total, 6 protein modules were identified (excluding the gray module), and no modules needed to be merged (Fig. 1b, Supplementary Fig. 2, Supplementary Table 1). Module-trait relationships analysis revealed that the eigenproteins of the brown module (68 proteins), turquoise module (100 proteins), green module (38 proteins), blue module (84 proteins), and yellow module (40 proteins) were associated with the total score of the CTQ (r = −0.15, p = 0.009); sampling time, antidepressant usage, and overeating item of the SCL-90-R (SCL-60) (r = 0.16, p = 0.006); antipsychotic usage, antidepressant usage, and total score of the YMRS (r = −0.15, p = 0.008); total score of the CTQ (r = −0.12, p = 0.03); and exercise (r = 0.12, p = 0.04), respectively (Fig. 2). Linear regression analysis revealed that the eigenprotein of the turquoise module remained statistically significant for the overeating item of the SCL-90-R after controlling for sampling time and antidepressant usage (t = 2.400, p = 0.017).
Additional analysis of hospital type using ANOVA revealed that only the green module was significantly associated with hospital type (F = 4.356, p = 0.001). After controlling for hospital type, the eigenprotein of the green module was no longer statistically significant with the total YMRS score (t = −1.804, p = 0.07).
Analysis was conducted with each individual item of the MADRS. The brown and turquoise modules did not exhibit statistical significance for any individual item of the MADRS. The green module exhibited a statistical significance for MADRS-4 (reduced sleep) and MADRS-6 (concentration difficulty), but no significant association was noted after controlling for hospital type.
After considering the strength of the associations between modules and traits, and as the green module was influenced by hospital type, further analysis was performed for the brown and turquoise modules.
Hub protein identification of the brown and turquoise module In the brown module, module membership and protein significance for the total score of CTQ exhibited a significant linear relationship. In the turquoise module, module membership and protein significance for the overeating item of SCL-90-R (SCL-60) exhibited a significant linear relationship (Fig. 3a). The top 10 hub proteins in these modules based on MCC are illustrated in Fig. 3b, and the module membership for each protein is presented in Supplementary Table 2. Complement components and apolipoproteins were dominant in the brown and turquoise modules, respectively.

Functional annotation of the brown and turquoise module
For the brown module, significant terms were related to immune responses. For the turquoise module, significant terms were predominantly related to lipid metabolism. The detailed terms and pathways of the functional analysis are summarized in Table 2, and more comprehensive results are presented in Supplementary  Table 3.

Module stability
Network stability analysis revealed that reproducible proteinmodule assignments had averages of 81.35% and 82.77% for proteins in the brown and turquoise modules, respectively ( Supplementary Fig. 3).

DISCUSSION
In this study of individuals with affective disorders, weighted correlation network analysis revealed six protein modules (excluding the gray module) in which two protein network modules were significantly associated with childhood trauma and overeating. The brown module with childhood trauma and the turquoise module with overeating exhibited significant associations between module membership and protein significance. Key proteins and functions were identified, with the lack of any association between the differentiation of MDD and BD based on the constructed protein network modules. Compared to other studies conducted using peripheral blood [9,36], simultaneous multiplex quantification of numerous proteins enabled us to conduct an analysis using a larger number of proteins. In addition, the effects of age, sex, and BMI of protein masses were discarded to reduce the likelihood of identifying associations with factors other than psychopathology [37,38], which was also in line with previous weighted correlation network analyses [39][40][41][42]. This approach enabled us to identify psychopathological traits associated with correlated network-based mechanisms. In this regard, an analysis using a single or few individual protein marker(s) seems insufficient to encompass complex biological pathways.
The brown module with 68 proteins was associated with the degree of childhood trauma. Several complement components were hub proteins within this network, alongside other proteins such as ITIHs and ANT3. Hub proteins with the highest connectivity were C1R, C1S, and ANT3. C1R and C1S are the proteases responsible for the activation and proteolytic activity of the C1 complex [43]. ANT3 is a serine protease inhibitor that suppresses coagulation and hemostasis, and inflammation [44]. The main roles of these proteins are to regulate complement and coagulation pathways, which are known to interact with each other in BD [45] and MDD [46]. In addition, there is evidence that these pathways are dysregulated even from childhood before patients develop psychotic symptoms [47]. Dysregulation is also observed in those with post-traumatic stress disorder [48], and there was a report that multiple traumatic events were directly proportional to the degree of systemic inflammation [49]. Our result is in line with these studies and proposes that key proteins that regulate complement and coagulation pathways are actually associated with childhood trauma. Although evidence of immune dysfunction and inflammation in adults with childhood trauma among individuals with affective disorders is accumulating [50], previous studies have focused on glucocorticoid pathways and interleukins [51][52][53][54]. Our study expands previous findings to network-level protein co-expression.
The presence of childhood trauma is known to increase the severity of these disorders and is associated with greater treatment resistance and a higher relapse rate [55][56][57]. A recent randomized control trial of infliximab for bipolar depression revealed that this treatment was effective in individuals with childhood trauma [58]. The present finding may explain why individuals with childhood trauma are resistant to conventional therapies. In this regard, agents with anti-inflammatory and immune function properties that target the co-expression of these proteins may be beneficial. Further investigation in a longitudinal design to determine the association between treatment-resistant depression and the co-expression of complement and coagulation pathways would be informative.
The turquoise module with 100 proteins was associated with overeating. Accumulating evidence based on genomics, transcriptomics, and proteomics suggests that individuals of MDD with hyperphagia have more dominant inflammatory-metabolic properties [59][60][61]. Our study expanded these findings to a transdiagnostic affective disorder population, including both MDD and BD. Furthermore, our results highlighted an interplay between these proteins rather than the involvement of individual proteins. When comparing hyperphagia-associated proteomic modules from our study with the NESDA study, APOB was the only common hub protein [9]. Our study removed the effects of age, sex, and BMI, and the proteins that were actually quantified had discrepancies, which would have yielded different networks. In our study, apolipoproteins were the major hub proteins of the hyperphagiaassociated module.
The key protein with the highest connectivity in this module was APOA2 (Apolipoprotein A-II), which is the second most abundant apolipoprotein in high-density lipoprotein particles [62]. Its alteration is known to be involved in various conditions, including metabolic syndrome and insulin resistance [63]. Considering psychiatric studies, its level was associated with metabolic syndrome in schizophrenia [64], associated with psychotropic medication in suicide completers [65], and altered in the prodromal stage of BD [66]. These studies did not specifically evaluate appetite, and direct comparison is limited, but some results do suggest that a metabolic subtype could be involved. APOA2 is related to tryptophan metabolism, which is not only related to appetite but also to depression, as tryptophan is the main precursor of serotonin [67,68].
Several studies have identified an association between affective disorders and individual apolipoprotein markers [69][70][71]. However, research with individual apolipoproteins in affective disorders has mainly focused on cognition [72,73], and the association with hyperphagia warrants further investigation. Nevertheless, an association between atypical depression and lipid metabolism has been reported [74]. The current evidence of these individual associations may not reflect complex protein-level networks, underscoring the need for further network-level-based studies. Further analysis entangling the sophisticated association between apolipoproteins, appetite, and depression should be investigated in the future.
On the other hand, no module was associated with the differentiation of MDD and BD. Most of the proteins from our multi-protein model in our previous study that differentiated MDD and BD were dispersed in several protein modules [10]. As there are various mechanisms that are related to the differentiation of these disorders, they probably would not be captured in a single protein network. Additionally, this implies that without the consideration of childhood trauma and hyperphagia, the differentiation of MDD and BD based on circulating proteomics has limitations. The need to focus on the degree of childhood trauma and overeating may be required, not only in biological studies but also in clinical settings, as biological treatment responses could be affected by these protein networks. Interestingly, despite several controversies, both childhood trauma and hyperphagia have been proposed as risk factors for conversion from MDD to BD [75,76]. Therefore, not only should we consider the presence of (hypo) manic symptoms, but also consider these clinical traits, or integrate these clinical traits with biological correlates, to precisely phenotype those with affective disorders [77,78].

Strengths and limitations
The following limitations of the study should be considered. First, the network was based on peripheral proteins; hence, the functional analysis has limitations in interpretation. Despite evidence of blood-brain barrier dysfunction in psychiatric disorders, peripheral blood does not always reflect the CNS. In this regard, our data suggest that childhood trauma and overeating in affective disorders are associated with systematic immune function and lipid metabolism. Second, other traits may not have been considered. Although various scales were utilized, other known factors that are known to help differentiate MDD from BD, including familial history [79] and treatment-resistant depression [80], need to be analyzed in the future. Third, there may have been other proteins that were not covered. However, the number of proteins quantified in this study was substantially larger than that in previous studies, and analysis was performed on a targeted basis method. Fourth, the interpretation of causality is limited due to the cross-sectional design. This study is based on correlation interference and not on causal interference. Longitudinal analysis with multiple measurements of clinical traits and plasma proteomes will enable investigation of the preservation of proteomic networks and their relationship with clinical traits. Fifth, sex-stratified analysis could not be interpreted due to decreased stability of the co-expression network. Although we discarded the effect of sex, it is known that males and females have distinct innate and adaptive responses [81]. A larger sample size could overcome this limitation in the future. Finally, independent validation was not performed in this study. However, subsampling was performed on the protein modules to reveal a rather stable network.  Nevertheless, to our knowledge, this study is the first to construct plasma protein networks with numerous proteins and to apply these networks to a transdiagnostic population of affective disorders. As proteins reflect biological functions, the degree of childhood trauma and overeating should be considered as candidates for endophenotypes in affective disorders. Further studies, including replication and longitudinal designs, will enable us to verify these results.

DATA AVAILABILITY
The raw MRM-MS files, including quantitative MS spectra for 454 peptides/420 proteins for all 299 plasma samples, were deposited into PeptideAtlas (Dataset identifier: PASS04825, Password: SA5392tsq). The other datasets presented in this study may be available from the corresponding authors upon reasonable request.